Symmetric Indefinite Linear Solver using OpenMP Task on Multicore Architectures
Authors
Abstract
Recently, the Open Multi-Processing (OpenMP) standard has incorporated task-based programming, where a function call with input and output data is treated as a task. At run time, OpenMP’s superscalar scheduler tracks the data dependencies among the tasks and executes the tasks as their dependencies are resolved. On a shared-memory architecture with multiple cores, the independent tasks are executed on different cores in parallel, thereby enabling parallel execution of a seemingly sequential code. With the emergence of many-core architectures, this type of programming paradigm is gaining attention—not only because of its simplicity, but also because it breaks the artificial synchronization points of the program and improves its thread-level parallelization. In this paper, we use these new OpenMP features to develop a portable high-performance implementation of a dense symmetric indefinite linear solver. Obtaining high performance from this kind of solver is a challenge because the symmetric pivoting, which is required to maintain numerical stability, leads to data dependencies that prevent us from using some common performance-improving techniques. To fully utilize a large number of cores through tasking, while conforming to the OpenMP standard, we describe several techniques. Our performance results on current many-core architectures—including Intel’s Broadwell, Intel’s Knights Landing, IBM’s Power8, and Arm’s ARMv8—demonstrate the portable and superior performance of our implementation compared with the Linear Algebra PACKage (LAPACK). The resulting solver is now available as a part of the PLASMA software package.
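As a minimal sketch of the task-with-dependencies model the abstract describes, the C fragment below submits two tasks per block of a vector. It is not code from the paper or from PLASMA: the kernel names (`fill_block`, `scale_block`), the driver `tasked_pipeline`, and the block-size parameter `bs` are all hypothetical. The `depend` clauses are assumed to use the first element of each block as the dependency token, so the runtime can order the two tasks that touch one block while running different blocks in parallel.

```c
#include <stddef.h>

/* Hypothetical kernels for illustration only; not taken from PLASMA. */
static void fill_block(double *x, size_t n, double value) {
    for (size_t i = 0; i < n; i++) x[i] = value;
}

static void scale_block(const double *x, double *y, size_t n, double alpha) {
    for (size_t i = 0; i < n; i++) y[i] = alpha * x[i];
}

/* Fills x with 1.0 and writes y = 2.0 * x block by block, then returns
 * the sum of y. The code reads like a sequential loop; with OpenMP
 * enabled, each call becomes a task and the runtime derives the
 * execution order from the depend clauses. Without OpenMP the pragmas
 * are ignored and the same result is computed sequentially. */
double tasked_pipeline(double *x, double *y, size_t n, size_t bs) {
    #pragma omp parallel
    #pragma omp single
    {
        for (size_t start = 0; start < n; start += bs) {
            size_t len = (n - start < bs) ? n - start : bs;
            double *xb = x + start;  /* current block of x */
            double *yb = y + start;  /* current block of y */

            /* Producer: writes the block of x. */
            #pragma omp task depend(out: xb[0]) firstprivate(xb, len)
            fill_block(xb, len, 1.0);

            /* Consumer: runs only after the producer of the same block
             * has finished; blocks are independent of one another. */
            #pragma omp task depend(in: xb[0]) depend(out: yb[0]) \
                             firstprivate(xb, yb, len)
            scale_block(xb, yb, len, 2.0);
        }
    }
    /* All tasks are complete at the implicit barrier that ends the
     * parallel region, so y can be read safely here. */
    double total = 0.0;
    for (size_t i = 0; i < n; i++) total += y[i];
    return total;
}
```

Calling `tasked_pipeline(x, y, 10, 4)` on two caller-allocated arrays of length 10 would process blocks of 4, 4, and 2 elements. Building with an OpenMP flag (for example `-fopenmp` with GCC) enables the tasking; the dense solver in the paper relies on the same mechanism, but with tile kernels and far more intricate dependencies caused by symmetric pivoting.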
Similar resources
An efficient distributed randomized solver with application to large dense linear systems
Randomized algorithms are gaining ground in high performance computing applications as they have the potential to outperform deterministic methods, while still providing accurate results. In this paper, we propose a randomized algorithm for distributed multicore architectures to efficiently solve large dense symmetric indefinite linear systems that are encountered, for instance, in parameter es...
An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems
Randomized algorithms are gaining ground in high-performance computing applications as they have the potential to outperform deterministic methods, while still providing accurate results. We propose a randomized solver for distributed multicore architectures to efficiently solve large dense symmetric indefinite linear systems that are encountered, for instance, in parameter estimation problems ...
On the Performance of an Algebraic Multigrid Solver on Multicore Clusters
Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG’s performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread...
Design of a Multicore Sparse Cholesky Factorization Using DAGs
The rapid emergence of multicore machines has led to the need to design new algorithms that are efficient on these architectures. Here, we consider the solution of sparse symmetric positive-definite linear systems by Cholesky factorization. We were motivated by the successful division of the computation in the dense case into tasks on blocks and use of a task manager to exploit all the parallel...
Analyzing Performance and Power of Multicore Architecture Using Multithreaded Iterative Solver
Problem statement: Scientific modeling and simulations have been popularly used with experiments and theoretical analysis in science and engineering communities. Approach: Consequently, computational demands are growing exponentially to afford large scale modeling and simulations. Results: As a result, multicore computing architectures had been proposed and several products are already availabl...
Publication date: 2018